Return-Path: <icon-group-sender>
Received: from kingfisher.CS.Arizona.EDU (kingfisher.CS.Arizona.EDU [192.12.69.239])
by baskerville.CS.Arizona.EDU (8.9.1a/8.9.1) with SMTP id IAA16983
for <icon-group-addresses@baskerville.CS.Arizona.EDU>; Tue, 25 Aug 1998 08:21:48 -0700 (MST)
Received: by kingfisher.CS.Arizona.EDU (5.65v4.0/1.1.8.2/08Nov94-0446PM)
id AA15430; Tue, 25 Aug 1998 08:21:24 -0700
Message-Id: <35E20EF3.6DAEA82A@ix.netcom.com>
Date: Mon, 24 Aug 1998 21:10:12 -0400
From: Phillip Lee Thomas <teruthom@ix.netcom.com>
Reply-To: thomaspl@acm.org
X-Mailer: Mozilla 4.05 [en] (Win95; U)
Mime-Version: 1.0
To: "Dr. Louis A. Turk" <laturk@ibm.net>, icon-group@optima.CS.Arizona.EDU
Subject: Re: Why doesn't this work?
References: <2.2.32.19980822053809.003390d8@pop5.ibm.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO
Content-Length: 3736
I'd have to run this against a piece of text to be sure, but there is a problem in the
first few lines. If a single line has the shape <P>some text</P>, you write that line out
and then keep reading further lines until you find one containing </P>. From a brief look,
I would expect the output to have lines glued together, such as "<P>some text</P> some more
text and some more and <P> something that meets condition 2</P>".
Secondly, HTML treats line breaks as ordinary white space for parsing purposes, so it is
quite possible that a single input line contains several "<P>...</P>" sequences.
I suggest that you read the whole document into one string, megaline, with a single reads()
call (its second argument is the number of characters to read), if this doesn't blow your
memory. Then use megaline := map(megaline,"\n"," ") to convert newlines to spaces, and
finally do a string scan along megaline, chopping off pieces as you go:
   megaline ? {
      while line := tab(find("<P>")) do {
         line ||:= tab(find("</P>") + 4)
         write(out, line)
      }   # while in <P>...</P>
   }   # end scan
You'd have to fiddle a bit with this but it's close to being right.
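One way the fiddling might go, as a sketch rather than a finished filter: the scan above
writes the text in front of each <P> on the same line as the paragraph, so this version
writes that text separately. It assumes upper-case tags and a file small enough to hold in
memory, and it builds megaline by reading line by line and joining with blanks instead of
using one reads() plus map(). The name chunk and the usage message are made up for
illustration.

```icon
# Sketch: put each <P>...</P> paragraph on a single output line.
# Assumptions: tags are upper case; the whole file fits in memory.

procedure main(arg)
   local in, out, megaline, chunk

   if *arg < 2 then stop("usage: joinpara infile outfile")

   in  := open(arg[1], "r") | stop("Can't open file: ", arg[1])
   out := open(arg[2], "w") | stop("Can't open file: ", arg[2])

   # Join all input lines into one string, separated by blanks.
   megaline := ""
   while megaline ||:= read(in) || " "

   megaline ? {
      while chunk := tab(find("<P>")) do {
         # Text in front of the paragraph goes out on its own line.
         chunk := trim(chunk)
         if *chunk > 0 then write(out, chunk)
         # The paragraph itself: everything through the closing tag,
         # or the rest of the input if the closing tag is missing.
         write(out, tab(find("</P>") + 4) | tab(0))
      }
      # Whatever follows the last paragraph.
      chunk := trim(tab(0))
      if *chunk > 0 then write(out, chunk)
   }

   close(in)
   close(out)
end
```

The same loop could be repeated for <UL>...</UL>, or the two tags could be handled in one
scan by looking for whichever opening tag comes first.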
Sound possible?
Phillip Thomas
Dr. Louis A. Turk wrote:
> Can anybody tell me why this code only removes CR/LF's every other paragraph
> that contains them? Why does it skip a paragraph?
>
> Louis
>
> Obviously, there will be more to this program, once I get past this problem.
>
> ############################################################################
> #############
> #
> # HTML TO Nota Bene 4.5 FILTER
> # Ver. 1.0 Aug.
> # Programmer: Louis A. Turk
> #
> # USE: Converts HTML to Nota Bene using two passes. FIRST PASS:
> # 1. Removes the CR/LF's between <P> and </P>
> # 2. Removes the CR/LF's between <UL> and </UL> and also removes right indention.
> # SECOND PASS:
> # 3. Replaces all HTML code with Nota Bene code.
> #
> ############################################################################
> ##############
>
> link graphics
>
> procedure main(arg)
>
>    WOpen("size=1005,850")
>
>    infile := arg[1]
>    outfile := arg[2]
>    tempfile := "temp3.txt"
>
>    in := open(infile,"r") | stop("Can't open file: ",in)
>    out := open(outfile,"w") | stop("Can't open file: ",out)
>    tmp := open(tempfile,"c") | stop("Can't open file: ",tmp)
>
>    #### FIRST PASS: REMOVE EXCESS CR/LF's
>    ######################################
>
>    while line := read(in) do {
>       if find(line,"<P>") then {                  # Beginning of defective code
>          WWrites(line," ")
>          writes(tmp,line," ")
>          until find(line := read(in),"</P>") do {
>             WWrites(line," ")
>             writes(tmp,line," ")
>          }
>          WWrite(line)
>          write(tmp,line)
>       }
>       else if find(line,"<UL>") then {
>          WWrites(line," ")
>          writes(tmp,line," ")
>          until find(line := read(in),"</UL>") do {
>             WWrites(line," ")
>             writes(tmp,line," ")
>          }
>          WWrite(line)
>          write(tmp,line)
>       }                                           # End of defective code
>       else {
>          WWrite(line)
>          write(tmp,line)
>       }
>
>    }
>    ##### SECOND PASS: #######################################
>    Event()
> end